Conversation
|
✅ DCO Check Passed Thanks @Nintorac, all your commits are properly signed off. 🎉 |
Merge ProtectionsYour pull request matches the following merge protections and will not be merged until they are valid. 🔴 Require two reviewer for test updatesThis rule is failing.When test data is updated, we require two reviewers
🟢 Enforce conventional commitWonderful, this rule succeeded.Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
|
|
@Nintorac this looks interesting, but we would need to validate the dependency impact it brings to docling-core. We are trying to keep docling-core dependencies light and make them optional where possible. Do you think Also, please note we have created a layer for handling workloads on non-local file storage already with docling-jobkit. It implements a number of connectors for different cloud storage. |
|
No need for any further dependencies, the main changes here are to make sure all filesystem calls run through pathlib, UPath implements the pathlib api - so tests also only need to test that it works with pathlib.Path I think if this lib were to implement anything it would be optional and at the fsspec level but this is a simple enough way to get compatibility without extra dependencies and minimal code changes. In case you haven't gone down the rabbit hole fsspec is quite well supported - |
|
Ah sorry, didn't mean to send the last message from that account |
ceberam
left a comment
There was a problem hiding this comment.
Thanks @Nintorac for creating this PR.
Please review the contributing guidelines and ensure that the PR follows the conventional commits and that the code checks pass.
|
@Nintorac This is an interesting PR, for finalizing we need
Additionally, (optional) we could add some tests which uses the |
dd5e1c2 to
0ab6466
Compare
|
Sorry, got a little busy. I have fixed the commit message and added some testing :) |
Unfortunately the commit message still doesn't contain the sign-off line. FYI, we usually add it by doing the commit with |
0ab6466 to
fe08a85
Compare
|
oh my bad, fixed. I also couldn't get the mypy passing on my machine but it didn't look related to my changes. |
@Nintorac there are indeed type checks that fail, as shown by Mypy. Most probably they are caused by the newly introduced assignment: which makes Mypy infer that Please, make sure to resolve all the code checks before pushing new commits. I strongly recommend installing |
Enable cloud storage backends (S3, GCS, Azure, etc.) for document serialization methods via fsspec/universal-pathlib integration. Signed-off-by: Nintorac <Nintorac@users.noreply.github.com>
fe08a85 to
2518770
Compare
|
ah, yea I see - I didn't have treesitter/some other optional deps which were expected by mypy which added some irrelevant errors and disguised mine |
|
ah, yea I see - I didn't have treesitter/some other optional deps which were expected by mypy |
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Great project! Thank you :)
Here is a quick and dirty AI sketchup to resolve #443 - still need some tests and polish, but wanted to get some initial feedback. Is this something you're interested in? Is there anything missed?
Cheers :)
some manual test code, instal upath with
pip install universal_pathlib